<html>
<head>
<title>OGR ElasticSearch Driver</title>
</head>

<body bgcolor="#ffffff">

<h1>ElasticSearch:  Geographically Encoded Objects for ElasticSearch</h1>

(Driver available in GDAL 1.10 or later)</br>

Driver is read-write starting with GDAL 2.1 (was write only in GDAL 2.0 or earlier)
<p>

<a href="http://elasticsearch.org/">ElasticSearch</a> is an Enterprise-level
search engine for a variety of data sources. It supports full-text indexing and
geospatial querying of those data in a fast and efficient manor using a predefined REST API.

<h2>Opening dataset name syntax</h2>

Starting with GDAL 2.1, the driver supports reading existing indices from a
ElasticSearch host. There are two main possible syntaxes to open a dataset:
<ul>
<li>Using <i>ES:http://hostname:port</i> (where port is typically 9200)</li>
<li>Using <i>ES:</i> with the open options to specify HOST and PORT</li>
</ul>

The open options available are :
<ul>
<li><b>HOST</b>=hostname: Server hostname. Default to localhost.</li>
<li><b>PORT</b>=port. Server port. Default to 9200.</li>
<li><b>BATCH_SIZE</b>=number. Number of features to retrieve per batch. Default is 100.</li>
<li><b>FEATURE_COUNT_TO_ESTABLISH_FEATURE_DEFN</b>=number. Number of features to
retrieve to establish feature definition. -1 = unlimited. Defaults to 100.</li>
<li><b>JSON_FIELD</b>=YES/NO. Whether to include a field called "_json" with
the full document as JSON. Defaults to NO.</li>
<li><b>FLATTEN_NESTED_ATTRIBUTE</b>=YES/NO. Whether to recursively explore nested
objects and produce flatten OGR attributes. Defaults to YES.</li>
<li><b>FID</b>=string. Field name, with integer values, to use as FID. Defaults to 'ogc_fid'</li>
</ul>

<h2>ElasticSearch vs OGR concepts</h2>

Each mapping type inside a ElasticSearch index will be considered as a OGR layer.
A ElasticSearch document is considered as a OGR feature.<p>

<h2>Field definitions</h2>

Fields are dynamically mapped from the input OGR data source. However, the
driver will take advantage of advanced options within ElasticSearch as defined
in a <a href="http://code.google.com/p/ogr2elasticsearch/wiki/ModifyingtheIndex">field mapping file</a>.

<p>
The mapping file allows you to modify the mapping according to the
<a href="http://www.elasticsearch.org/guide/reference/mapping/core-types.html">ElasticSearch field-specific types</a>.
There are many options to choose from, however, most of the functionality is
based on all the different things you are able to do with text fields within ElasticSearch.
<pre>
ogr2ogr -progress --config ES_WRITEMAP /path/to/file/map.txt -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>
<p>

The ElasticSearch writer supports the following Configuration Options. Starting
with GDAL 2.1, layer creation options are also available and should be preferred:
<ul>
<li> <b>ES_WRITEMAP</b>=/path/to/mapfile.txt. Creates a mapping file that can be
modified by the user prior to insert in to the index. No feature will be written.
Note that this will properly work only if only one single layer is created.
Starting with GDAL 2.1, the <b>WRITE_MAPPING</b> layer creation option should rather be used.<br>
<li> <b>ES_META</b>=/path/to/mapfile.txt. Tells the driver to the user-defined
field mappings. Starting with GDAL 2.1, the <b>MAPPING</b> layer creation option should rather be used.<br>
<li> <b>ES_BULK</b>=10000. Identifies the maximum size in bytes of the buffer
to store documents to be inserted at a time. Lower record counts help with
memory consumption within ElasticSearch but take longer to insert.
Starting with GDAL 2.1, the <b>BULK_SIZE</b> layer creation option should rather be used.<br>
<li> <b>ES_OVERWRITE</b>=1. Overwrites the current index by deleting an existing one.
Starting with GDAL 2.1, the <b>OVERWRITE</b> layer creation option should rather be used.<br>
</ul>
<p>

<h2>Geometry types</h2>

In GDAL 2.0 and earlier, the driver was limited in the geometry it handles:
even if polygons were provided as input, they were stored as
<a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html">geo point</a>
and the "center" of the polygon is used as value of the point.
Starting with GDAL 2.1,
<a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-geo-shape-type.html">geo_shape</a>
is used to store all geometry types (except curve geometries that are not handled
by ElasticSearch and will be approximated to their linear equivalents).
</p>

<h2>Filtering</h2>

The driver will forward any spatial filter set with SetSpatialFilter() to the server.<p>

However, in the current state, SQL attribute filters set with SetAttributeFilter()
are evaluated only on client-side. To enable server-side filtering, the string
passed to SetAttributeFilter() must be a JSon object in the
<a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filters.html">ElasticSearch filter syntax</a>.
e.g. <pre>{ "filter": { "term": { "EAS_ID": 169 } } }</pre>
<p>

Note: if defining an Elastic Search JSon filter, the spatial filter specified
through SetSpatialFilter() will be ignored, and must thus be included in the JSon
filter if needed.<p>

<h2>Paging</h2>

Features are retrieved from the server by chunks of 100. This can be
altered with the BATCH_SIZE open option.<p>

<h2>Schema</h2>

When reading a Elastic Search index/type, OGR must establish the schema of attribute and geometry
fields, since OGR has a fixed schema concept.<p>

In the general case, OGR will read the mapping definition and the first 100 documents (can be altered with
the FEATURE_COUNT_TO_ESTABLISH_FEATURE_DEFN open option) of the index/type and build
the schema that best fit to the found fields and values.<p>

It is also possible to set the JSON_FIELD=YES open option so that a _json special
field is added to the OGR schema. When reading Elastic Search documents as OGR features,
the full JSon version of the document will be stored in the _json field. This might
be useful in case of complex documents or with data types that do not translate well
in OGR data types. On creation/update of documents, if the _json field is present
and set, its content will be used directly (other fields will be ignored).<p>

<h2>Feature ID</h2>

Elastic Search have a special _id field that contains the unique ID of the document. This
field is returned as an OGR field, but cannot be used as the OGR special FeatureID
field, which must be of integer type. By default, OGR will try to read a potential
'ogc_fid' field to set the OGR FeatureID. The name of this field to look up can
be set with the FID open option. If the field is not found, the FID returned by
OGR will be a sequential number starting at 1, but it is not guaranteed to be
stable at all.<p>

<h2>ExecuteSQL() interface</h2>

If specifying "ES" as the dialect of ExecuteSQL(), a JSon string with a
serialized <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-filters.html">Elastic Search filter</a>
can be passed. The search will be done on all indices and types, unless the
filter itself restricts the search. The returned layer will be a union of the
types returned by the FEATURE_COUNT_TO_ESTABLISH_FEATURE_DEFN first documents. It
will also contain the _index and _type special fields to indicate the provenance
of the features.<p>

The following filter can be used to restrict the search to the "poly" index and its "FeatureCollection" type mapping:
<pre>
{ "filter": {
    "indices" : {
        "no_match_filter": "none",
        "index": "poly",
        "filter": {
           "and" : [
             { "type": { "value": "FeatureCollection" } },
             { "term" : { "EAS_ID" : 158.0 } }
           ]
        }
      }
    }
}
</pre>

Aggregations are not supported.<p>

Standard SQL requests will be executed on client-side.<p>

<h2>Getting metadata</h2>

Getting feature count is efficient.<p>

Getting extent is efficient, only on geometry columns mapped to ElasticSearch type geo_point.
On geo_shape fields, feature retrieval of the whole layer is done, which might be slow.<p>

<h2>Write support</h2>

Index/type creation and deletion is possible.<p>

Write support is only enabled when the datasource is opened in update mode.<p>

When inserting a new feature with CreateFeature() in non-bulk mode, and if the command is successful, OGR will fetch the
returned _id and use it for the SetFeature() operation.<p>

<h2>Spatial reference system</h2>

Geometries stored in Elastic Search are supposed to be referenced as longitude/latitude
over WGS84 datum (EPSG:4326). On creation, the driver will automatically reproject
from the layer (or geometry field) SRS to EPSG:4326, provided that the input SRS
is set and that is not already EPSG:4326.<p>

<h2>Layer creation options</h2>

Starting with GDAL 2.1, the driver supports the following layer creation options:
<ul>
<li><b>INDEX_NAME</b>=name. Name of the index to create (or reuse). By default the index name is the layer name.</li>
<li><b>MAPPING_NAME=</b>=name. Name of the mapping type within the index.
By default, the mapping name is "FeatureCollection" and the documents will be
written as GeoJSON Feature objects. If another mapping name is chosen, a more "flat" structure will be used.</li>
<li><b>MAPPING</b>=filename or JSon. Filename from which to read a user-defined
mapping, or mapping as serialized JSon.</li>
<li><b>WRITE_MAPPING</b>=filename. Creates a mapping file that can be modified
by the user prior to insert in to the index. No feature will be written.
This option is exclusive with MAPPING.</li>
<li><b>OVERWRITE</b>=YES/NO. Whether to overwrite an existing collection with
the layer name to be created. Defaults to NO.</li>
<li><b>GEOMETRY_NAME</b>=name. Name of geometry column. Defaults to 'geometry'.</li>
<li><b>GEOM_MAPPING_TYPE</b>=AUTO/GEO_POINT/GEO_SHAPE. Mapping type for geometry fields. Defaults to AUTO.
GEO_POINT uses the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html">geo_point</a>
mapping type. If used, the "centroid" of the geometry is used.
This is the behaviour of GDAL &lt; 2.1.
GEO_SHAPE uses the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-geo-shape-type.html">geo_shape</a>
mapping type, compatible of all geometry types.
When using AUTO, if there is a single geometry field of type Point, a geo_point is used.
In other cases, geo_shape is used.</li>
<li><b>GEOM_PRECISION</b>={value}{unit}'. Desired geometry precision. Number followed by unit. For example 1m.
For a geo_point geometry field, this causes a compressed geometry format to be used.
This option is without effect if MAPPING is specified.</li>
<li><b>STORE_FIELDS</b>=YES/NO. Whether fields should be stored in the index.
Setting to YES sets the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html">"store" property</a> of the field mapping to "true" for all fields. Defaults to NO.
(Note: prior to GDAL 2.1, the default behaviour was to store fields)
This option is without effect if MAPPING is specified.</li>
<li><b>STORED_FIELDS</b>=List of comma separated field names that should be stored in the index.
Those fields will have their <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html">"store" property</a> of the field mapping set to "true".
If all fields must be stored, then using STORE_FIELDS=YES is a shortcut.
This option is without effect if MAPPING is specified.</li>
<li><b>NOT_ANALYZED_FIELDS</b>=List of comma separated field names that should not be analyzed during indexing.
Those fields will have their <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html">"index" property</a> of the field mapping set to "not_analyzed"
(the default in ElasticSearch is "analyzed"). A same field should not be
specified both in NOT_ANALYZED_FIELDS and NOT_INDEXED_FIELDS.
This option is without effect if MAPPING is specified.</li>
<li><b>NOT_INDEXED_FIELDS</b>=List of comma separated field names that should not be indexed.
Those fields will have their <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html">"index" property</a> of the field mapping set to "no"
(the default in ElasticSearch is "analyzed"). A same field should not be
specified both in NOT_ANALYZED_FIELDS and NOT_INDEXED_FIELDS.
This option is without effect if MAPPING is specified.</li>
<li><b>BULK_INSERT</b>=YES/NO. Whether to use bulk insert for feature creation. Defaults to YES.</li>
<li><b>BULK_SIZE</b>=value. Size in bytes of the buffer for bulk upload. Defaults to 1000000 (1 million).</li>
<li><b>FID</b>=string. Field name, with integer values, to use as FID. Can be set to empty to disable the writing of the FID value. Defaults to 'ogc_fid'</li>
<li><b>DOT_AS_NESTED_FIELD</b>=YES/NO. Whether to consider dot character in field name as sub-document. Defaults to YES.</li>
<li><b>IGNORE_SOURCE_ID</b>=YES/NO. Whether to ignore _id field in features passed to CreateFeature(). Defaults to NO.</li>
</ul>

<h2>Examples</h2>

<b>Open the local store:</b></br>
<pre>
ogrinfo ES:
</pre>

<b>Open a remote store:</b></br>
<pre>
ogrinfo ES:http://example.com:9200
</pre>

<b>Filtering on a Elastic Search field:</b><br>
<pre>
ogrinfo -ro ES: my_type -where '{ "filter": { "term": { "EAS_ID": 168 } } }'
</pre>

<b>Using "match" query on Windows:</b><br>
On Windows the query must be between double quotes and double quotes inside the query must be escaped.
<pre>
C:\GDAL_on_Windows>ogrinfo ES: my_type -where "{\"query\": { \"match\": { \"NAME\": \"Helsinki\" } } }"
</pre>
<b>Load an ElasticSearch index with a shapefile:</b> </br>
<pre>
ogr2ogr -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>

<b>Create a Mapping File:</b> </br>
The mapping file allows you to modify the mapping according to the
<a href="http://www.elasticsearch.org/guide/reference/mapping/core-types.html">ElasticSearch field-specific types</a>.
There are many options to choose from, however, most of the functionality is based
on all the different things you are able to do with text fields.
</br>
<pre>
ogr2ogr -progress --config ES_WRITEMAP /path/to/file/map.txt -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>
or (GDAL &gt;= 2.1):
<pre>
ogr2ogr -progress -lco WRITE_MAPPING=/path/to/file/map.txt -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>

<b>Read the Mapping File:</b> </br>
Reads the mapping file during the transformation</br>
<pre>
ogr2ogr -progress --config ES_META /path/to/file/map.txt -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>
or (GDAL &gt;= 2.1):
<pre>
ogr2ogr -progress -lco MAPPING=/path/to/file/map.txt -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>

<b>Bulk Uploading (for larger datasets):</b> </br>
Bulk loading helps when uploading a lot of data. The integer value is the number of bytes that are collection before being inserted.
</br>
<pre>
ogr2ogr -progress --config ES_BULK 10000 -f "ElasticSearch" http://localhost:9200 PG:"host=localhost user=postgres dbname=my_db password=password" "my_table" -nln thetable
</pre>
or (GDAL &gt;= 2.1):
<pre>
ogr2ogr -progress -lco BULK_SIZE=10000 -f "ElasticSearch" http://localhost:9200 my_shapefile.shp
</pre>

<b>Overwrite the current Index:</b> </br>
If specified, this will overwrite the current index. Otherwise, the data will be appended.
</br>
<pre>
ogr2ogr -progress --config ES_OVERWRITE 1 -f "ElasticSearch" http://localhost:9200 PG:"host=localhost user=postgres dbname=my_db password=password" "my_table" -nln thetable
</pre>
or (GDAL &gt;= 2.1):
<pre>
ogr2ogr -progress -overwrite ES:http://localhost:9200 PG:"host=localhost user=postgres dbname=my_db password=password" "my_table" -nln thetable
</pre>

<h2>See Also</h2>

<ul>
<li> <a href="http://elasticsearch.org/">Home page for ElasticSearch</a>
<li> <a href="http://code.google.com/p/ogr2elasticsearch/w/list">Examples Wiki</a>
<li> <a href="http://groups.google.com/group/ogr2elasticsearch">Google Group</a>

<p>
</ul>

</body>
</html>
